Abstract

In this paper, we present Gaussian process regression as a predictive model
for Quality-of-Service (QoS) attributes in Web service systems. The goal is to
predict the performance of the execution system, expressed as QoS attributes, given
an existing execution system, a service repository, and inputs, e.g., streams of requests.
To evaluate the performance of Gaussian process regression, a simulation
environment was developed. Two quality indexes were used, namely Mean
Absolute Error and Mean Squared Error. The results of the experiment
show that the Gaussian process performed best with the linear kernel and
statistically significantly better than the Classification and Regression Trees
(CART) method.
1 Introduction



Performance prediction in Web service systems is one of the most important issues in modern
computer networks, and it is still insufficiently solved by well-known methods because of the gap
between theoretical considerations and applications. This research explores this issue. In general,
Web service systems consist of the following layers [?]: (i) an execution layer, which controls the
execution of composite Web services and manages dataflow between them, and (ii) an application
service layer, which delivers requested functionalities to clients. Web services are designed according
to the service-oriented computing (SOC) paradigm [9] and represent encapsulated functionalities of
applications.

In this paper, we focus on the execution layer only. Our goal is to predict the performance of the
execution system, expressed as Quality-of-Service (QoS) attributes, given an existing execution system,
a service repository, and inputs, e.g., streams of requests. The predicted performance can be used not
only for personalization of services [19] but, most of all, for service selection [10, 15] and resource
allocation [13, 22]. For example, modelling the dependency between QoS attributes and streams of
requests allows computational resources to be allocated in an optimal way. Otherwise, other techniques
are needed, e.g., change detection methods [14, 18, 20]. A predictive model for QoS
attributes, however, can be used as an objective function in an optimization task for resource allocation.

In view of the above, proposing a predictive model becomes a crucial issue. It can be
assumed that an execution system with fixed computational resources and given inputs performs
roughly in a deterministic manner. Nevertheless, internal and unknown processes within the execution
system introduce random noise, and thus the QoS attributes are random variables as well. Hence,
a probabilistic model seems to be best suited to the considered application.

Recently, in the machine learning literature, a non-parametric regression model called the Gaussian
process has been introduced [3, 8, 12]. Gaussian processes are considered one of the most successful
regression models and have been applied in many domains, e.g., biosystems [1], predictive control for chemical
plants [7], hydraulic systems [6], learning an inverted pendulum [5], and non-linear system identification.



The main contribution of this paper is twofold. First, an application of Gaussian process to predicting 
performance of the execution system in Web service system is presented. Second, a simulation 
environment for Web traffic is proposed. 

The paper is organized as follows. In Sect. 2 the problem of performance prediction in Web service
systems is stated. In Sect. 3 details of Gaussian process models are outlined. In Sect. 4 the
simulation environment is described and experiments are conducted. Finally, conclusions are
drawn.

2 Prediction of QoS in Web service systems 

Let $\mathbf{x} \in \mathcal{X}$ denote a D-dimensional vector of input variables to the execution system. For example,
the inputs are total sizes of demands from D classes maintained in queues to the execution system,
$\mathcal{X} = \mathbb{R}^D$. Outputs of the execution system are denoted by $y \in \mathcal{Y}$ and correspond to QoS attributes,
e.g., time spent in the execution system (so-called latency).$^1$

Further, we assume that there exists a dependency between inputs and outputs. However, without
knowing the processes responsible for generating teletraffic and the internal processes governing the
execution system, we should consider noise in the model. The dependency between inputs and
QoS attributes can be seen as a regression model, hence the target variable y is given by a function
$f : \mathcal{X} \to \mathbb{R}$ and additive Gaussian noise:

$$y = f(\mathbf{x}) + \varepsilon, \tag{1}$$

where $\varepsilon$ is a zero-mean Gaussian random variable with precision $\beta$, $\varepsilon \sim \mathcal{N}(\cdot \mid 0, \beta^{-1})$.

The prediction task is to return an output y for given new inputs $\mathbf{x}$ and N historical observations
(data) $\mathcal{D} = \{(\mathbf{x}_n, y_n)\}_{n=1}^{N}$. Because we consider the probabilistic model (1), we need to calculate the
following predictive distribution:

$$p(y \mid \mathbf{x}, \mathcal{D}). \tag{2}$$

In order to calculate the predictive distribution, we have to specify a prior distribution over
the dependencies f. As we will see shortly, the calculation is analytically tractable as long as the prior is Gaussian.

In the context of Web service systems, we want to predict QoS attributes, e.g., latency, for given 
inputs. Here we do not consider dynamics of input streams, thus the efficiency of our approach 
relies on proper formulation of input variables and calculation of target variable. 

3 Gaussian process regression 
3.1 The model 

The idea of Gaussian processes is to put a prior distribution on the function f and to learn the dependencies
based on available data [3, 8, 12]. Moreover, the Gaussian process is a non-parametric model,
and thus there is no need to formulate any fixed relationship between inputs and the target variable.

1 A multivariate regression model can be treated as a problem of several one-dimensional regression models
[3], and thus, for simplicity, we consider only one output (target variable), i.e., y.
The non-linear regression model using a Gaussian process (called Gaussian process regression) is as
follows:

$$y = f(\mathbf{x}) + \varepsilon,$$
$$f \sim \mathcal{GP}(\cdot \mid 0, k(\mathbf{x}, \mathbf{x}')), \tag{3}$$

where $\mathcal{GP}$ denotes the Gaussian process and $k(\cdot, \cdot)$ is the covariance (kernel) function.

Now the predictive distribution (2) is analytically tractable because the prior on f is a $\mathcal{GP}$, the likelihood is
Gaussian, and the posterior distribution on f is also a $\mathcal{GP}$.

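Let $\mathbf{y}$ denote a column of output observations and $\mathbf{f}$ a column of $f(\mathbf{x}_n)$, $n = 1, \ldots, N$. From the
definition of the Gaussian process, the marginal distribution $p(\mathbf{f})$ is as follows:

$$p(\mathbf{f}) = \mathcal{N}(\mathbf{f} \mid \mathbf{0}, \mathbf{K}), \tag{4}$$

where $\mathbf{0}$ is a column of zeros and $\mathbf{K}$ is the Gram matrix, i.e., $K_{n,m} = k(\mathbf{x}_n, \mathbf{x}_m)$.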
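Similarly, the distribution of $\mathbf{y}$ conditioned on $\mathbf{f}$ is the following:

$$p(\mathbf{y} \mid \mathbf{f}) = \mathcal{N}(\mathbf{y} \mid \mathbf{f}, \beta^{-1} \mathbf{I}_N), \tag{5}$$

where $\mathbf{I}_N$ is the $N \times N$ identity matrix.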
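The marginal distribution $p(\mathbf{y})$ equals

$$p(\mathbf{y}) = \mathcal{N}(\mathbf{y} \mid \mathbf{0}, \mathbf{C}), \tag{6}$$

where $\mathbf{C}$ is a matrix such that $C_{n,m} = K_{n,m} + \beta^{-1} \delta_{n,m}$, and $\delta_{n,m}$ is the Kronecker delta.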

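Because all distributions are Gaussian, the predictive distribution$^2$ $p(y \mid \mathbf{x}, \mathbf{y})$ is a Gaussian
distribution with mean and variance given by [3, 12]:

$$m(\mathbf{x}) = \mathbf{k}^{\top} \mathbf{C}^{-1} \mathbf{y}, \tag{7}$$
$$\sigma^2(\mathbf{x}) = k(\mathbf{x}, \mathbf{x}) + \beta^{-1} - \mathbf{k}^{\top} \mathbf{C}^{-1} \mathbf{k}, \tag{8}$$

where $\mathbf{k}$ is a vector with elements $k(\mathbf{x}_n, \mathbf{x})$, $n = 1, \ldots, N$.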

Finally, for given inputs $\mathbf{x}$ to the execution system, we have calculated the predictive distribution
with mean and variance defined by (7) and (8), respectively. The Gaussian probability density function
has one mode, which is at the same time its mean value; hence the mean value (7) is the most probable
value for the given inputs and the variance (8) determines its uncertainty.
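Equations (6)-(8) translate into a few lines of code. The following is a minimal sketch in plain Python, not the toolbox used in the experiments. The function names (`cholesky`, `solve_spd`, `gp_predict`), the unit-weight linear kernel, and the toy data are illustrative assumptions; the kernel is passed in as an arbitrary function of two input vectors.

```python
import math

def cholesky(A):
    """Lower-triangular L with A = L L^T (A must be symmetric positive definite)."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L

def solve_spd(L, b):
    """Solve (L L^T) z = b by forward and then backward substitution."""
    n = len(b)
    u = [0.0] * n
    for i in range(n):
        u[i] = (b[i] - sum(L[i][k] * u[k] for k in range(i))) / L[i][i]
    z = [0.0] * n
    for i in reversed(range(n)):
        z[i] = (u[i] - sum(L[k][i] * z[k] for k in range(i + 1, n))) / L[i][i]
    return z

def gp_predict(X, y, x_new, kernel, noise_var):
    """Predictive mean (7) and variance (8); noise_var plays the role of beta^{-1}."""
    N = len(X)
    # C = K + beta^{-1} I, as in equation (6)
    C = [[kernel(X[n], X[m]) + (noise_var if n == m else 0.0) for m in range(N)]
         for n in range(N)]
    L = cholesky(C)
    k = [kernel(X[n], x_new) for n in range(N)]
    alpha = solve_spd(L, y)  # C^{-1} y
    v = solve_spd(L, k)      # C^{-1} k
    mean = sum(ki * ai for ki, ai in zip(k, alpha))
    var = kernel(x_new, x_new) + noise_var - sum(ki * vi for ki, vi in zip(k, v))
    return mean, var

# Toy usage (assumed data): one-dimensional inputs, linear kernel with unit weight.
def unit_linear(a, b):
    return sum(p * q for p, q in zip(a, b))

mean, var = gp_predict([[1.0], [2.0]], [1.0, 2.0], [3.0], unit_linear, noise_var=0.1)
print(mean, var)
```

A Cholesky factorization is used instead of an explicit inverse of C because it is the usual numerically stable choice for symmetric positive definite matrices; this is a design choice of the sketch, not something stated in the paper.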

3.2 Covariance function 

A crucial step in modelling any phenomenon using Gaussian processes is the determination of the
kernel function. There are many kernel functions described in the literature (see [12] for further details),
e.g., the linear kernel

$$k_{lin}(\mathbf{x}, \mathbf{x}') = \mathbf{x}^{\top} \Lambda^{-2} \mathbf{x}', \tag{9}$$

where $\Lambda^{-2}$ is a $D \times D$ diagonal matrix, the squared exponential kernel

$$k_{se}(\mathbf{x}, \mathbf{x}') = \sigma^2 \exp\{-\tfrac{1}{2} (\mathbf{x} - \mathbf{x}')^{\top} \Lambda^{-2} (\mathbf{x} - \mathbf{x}')\}, \tag{10}$$

where $\sigma^2$ is a scaling parameter (signal variance), and complex kernels, for example,

$$k(\mathbf{x}, \mathbf{x}') = k_{se}(\mathbf{x}, \mathbf{x}') + k_{lin}(\mathbf{x}, \mathbf{x}') + b, \tag{11}$$

where b is a bias parameter. Choosing a specific kernel allows different similarities between
points to be reflected (see Fig. 1).

2 Here we use a shorthand notation compared with equation (2).



Figure 1: Examples from a Gaussian prior defined by the following complex covariance function:
$k(\mathbf{x}, \mathbf{x}') = \theta_0 \exp\{-\frac{\theta_1}{2} \|\mathbf{x} - \mathbf{x}'\|^2\} + \theta_2 + \theta_3 \mathbf{x}^{\top} \mathbf{x}'$. The title above each plot denotes $(\theta_0, \theta_1, \theta_2, \theta_3)$.
Figure taken from [3].
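The kernels (9)-(11) can be sketched as plain Python functions. This is an illustration only: the function and parameter names are assumptions, the diagonal of $\Lambda^{-2}$ is passed as a list `inv_sq_scales`, and the squared exponential is written with the usual negative exponent and factor 1/2, which is an assumption about the intended form of (10).

```python
import math

def linear_kernel(x, z, inv_sq_scales):
    """Linear kernel (9): x^T Lambda^{-2} z with diagonal Lambda^{-2}."""
    return sum(w * a * b for w, a, b in zip(inv_sq_scales, x, z))

def se_kernel(x, z, inv_sq_scales, signal_var):
    """Squared exponential kernel (10), assuming the standard -1/2 in the exponent."""
    q = sum(w * (a - b) ** 2 for w, a, b in zip(inv_sq_scales, x, z))
    return signal_var * math.exp(-0.5 * q)

def complex_kernel(x, z, inv_sq_scales, signal_var, bias):
    """Sum of squared exponential, linear, and bias terms, as in (11)."""
    return (se_kernel(x, z, inv_sq_scales, signal_var)
            + linear_kernel(x, z, inv_sq_scales)
            + bias)

# Toy usage with assumed hyperparameter values.
x, z = [1.0, 2.0], [3.0, 4.0]
scales = [1.0, 1.0]
print(linear_kernel(x, z, scales))
print(se_kernel(x, z, scales, signal_var=1.0))
print(complex_kernel(x, z, scales, signal_var=1.0, bias=0.5))
```

The linear kernel grows with the magnitude of the inputs, whereas the squared exponential depends only on the distance between them, which is one way to read the "different similarities" mentioned above.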



3.3 Learning the hyperparameters 

The predictions of Gaussian process regression depend mainly on the choice of the covariance
function. From a practical point of view, it is more convenient to propose a parametric family of
covariance functions than to fix the covariance function by hand. The values
of the hyperparameters can then be inferred entirely from data.

In this paper we use the type 2 maximum likelihood procedure, which determines the hyperparameter
values by maximizing the log-likelihood function ($\boldsymbol{\theta}$ denotes a vector of hyperparameters):

$$\ln p(\mathbf{y} \mid \boldsymbol{\theta}) = -\frac{1}{2} \ln |\mathbf{C}| - \frac{1}{2} \mathbf{y}^{\top} \mathbf{C}^{-1} \mathbf{y} - \frac{N}{2} \ln(2\pi). \tag{12}$$

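If the derivatives of $\mathbf{C}$ with respect to the hyperparameters are straightforward to evaluate, we can easily
calculate the following derivatives of (12):

$$\frac{\partial}{\partial \theta_i} \ln p(\mathbf{y} \mid \boldsymbol{\theta}) = -\frac{1}{2} \mathrm{Tr}\left(\mathbf{C}^{-1} \frac{\partial \mathbf{C}}{\partial \theta_i}\right) + \frac{1}{2} \mathbf{y}^{\top} \mathbf{C}^{-1} \frac{\partial \mathbf{C}}{\partial \theta_i} \mathbf{C}^{-1} \mathbf{y}.$$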
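To make the type 2 maximum likelihood idea concrete, the sketch below evaluates the log likelihood (12) and picks the best value from a small set of candidates. It is only an illustration under stated assumptions: a simple grid search stands in for gradient-based optimization, only the noise variance $\beta^{-1}$ is tuned, and all names and data are hypothetical.

```python
import math

def cholesky(A):
    """Lower-triangular L with A = L L^T (A must be symmetric positive definite)."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L

def solve_spd(L, b):
    """Solve (L L^T) z = b by forward and then backward substitution."""
    n = len(b)
    u = [0.0] * n
    for i in range(n):
        u[i] = (b[i] - sum(L[i][k] * u[k] for k in range(i))) / L[i][i]
    z = [0.0] * n
    for i in reversed(range(n)):
        z[i] = (u[i] - sum(L[k][i] * z[k] for k in range(i + 1, n))) / L[i][i]
    return z

def log_marginal_likelihood(X, y, kernel, noise_var):
    """Equation (12) with C = K + noise_var * I."""
    N = len(X)
    C = [[kernel(X[n], X[m]) + (noise_var if n == m else 0.0) for m in range(N)]
         for n in range(N)]
    L = cholesky(C)
    alpha = solve_spd(L, y)                                   # C^{-1} y
    log_det = 2.0 * sum(math.log(L[i][i]) for i in range(N))  # ln |C|
    quad = sum(yi * ai for yi, ai in zip(y, alpha))           # y^T C^{-1} y
    return -0.5 * log_det - 0.5 * quad - 0.5 * N * math.log(2.0 * math.pi)

def pick_noise_variance(X, y, kernel, candidates):
    """Grid search: return the candidate with the highest log likelihood."""
    return max(candidates, key=lambda v: log_marginal_likelihood(X, y, kernel, v))

# Toy usage with assumed data and a unit-weight linear kernel.
def unit_linear(a, b):
    return sum(p * q for p, q in zip(a, b))

X = [[1.0], [2.0], [3.0]]
y = [1.1, 1.9, 3.2]
print(pick_noise_variance(X, y, unit_linear, [0.01, 0.1, 1.0]))
```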
4 Experiments



4.1 Preliminaries 

The purpose of the experiment is to examine the prediction quality of the proposed approach. We
consider a Web service execution environment and the average latency of service
responses in the system as the output QoS attribute. To reflect the nature of a real Web service execution
system, we propose a simulation environment written in Matlab®. The simulation model is presented
in Fig. 2. The model consists of the following components: (i) a teletraffic generator (TG), which
imitates clients' behaviour by generating Web service requests, (ii) a scheduler, which distributes service
requests to the proper queues, (iii) queues, which maintain demands and work in FIFO fashion, (iv)
Round Robin (RR), which collects requests from the queues in circular order [16], and (v) an execution system (ES),
which executes the services.






Figure 2: Schematic diagram of the simulation environment, TG - teletraffic generator, RR - Round 
Robin, ES - execution system. 



The simulation environment provides teletraffic data: inputs $\mathbf{x}$ to the execution system, which are the sizes
of the queues, and outputs of the execution system, which are average latencies y. Each quantity is
calculated using the T last observations. In the experiment we compare Gaussian process regression
with the well-known Classification And Regression Trees (CART) method [4], which serves as the baseline.
We use the Matlab implementation of CART and the toolbox for Gaussian processes
provided by Rasmussen and Nickisch [11].

4.2 Simulation details 

4.2.1 Modelling teletraffic 

We assume that in each time unit a demand to the system occurs with probability p. The class of the
demand is then generated with uniform probability, and the size of the demand is drawn from a lognormal
distribution.$^3$ The process of generating a demand is as follows (assuming some universal time unit,
e.g., one second):

1. Generate a random number from the interval [0, 1]. If it is not greater than p, go to step 2.
Otherwise repeat step 1.

2. Generate the demand class using uniform probability. 

3. Generate the size of the demand using lognormal distribution. Go to step 1. 
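The three steps above can be sketched as a short generator loop. This is not the Matlab code of the simulation environment; the function name, the tuple layout, and the reading of the lognormal parameters as (mu, sigma) of the underlying normal distribution are assumptions made for illustration. A demand is generated when the random number falls below p, so that demands occur with probability p per time unit.

```python
import random

def generate_demands(num_steps, p, lognormal_params, rng):
    """Return a list of (time, class_index, size) tuples, one time unit per step.

    lognormal_params[c] is assumed to be (mu, sigma) for demand class c.
    """
    num_classes = len(lognormal_params)
    demands = []
    for t in range(num_steps):
        # Step 1: a demand occurs in this time unit with probability p.
        if rng.random() < p:
            # Step 2: demand class drawn uniformly.
            cls = rng.randrange(num_classes)
            # Step 3: demand size drawn from a lognormal distribution.
            mu, sigma = lognormal_params[cls]
            demands.append((t, cls, rng.lognormvariate(mu, sigma)))
    return demands

# Toy usage with an assumed seed and assumed (mu, sigma) pairs for three classes.
rng = random.Random(0)
params = [(0.5, 0.25), (0.1, 0.5), (0.75, 0.15)]
demands = generate_demands(20, 0.5, params, rng)
print(len(demands))
```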

4.2.2 Modelling execution system 

One demand arrives at the execution system according to the Round Robin scheduler. Execution
of a demand takes as many universal time units as are needed to execute it completely, i.e., until the remaining size
of the demand is zero. In the simulation environment we need to determine the execution size per
universal time unit for each demand class.

3 According to [2], the probability distribution of the size of the file body is lognormal.
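A rough sketch of the queueing and execution logic described in this section is given below. It is an interpretation rather than the authors' implementation: the function name, the data layout, the choice to record queue contents right after arrivals, and the convention that latency is counted from arrival to the end of the time unit in which the demand finishes are all assumptions.

```python
from collections import deque

def simulate_execution(arrivals, rates, num_steps):
    """Simulate FIFO queues served in Round Robin order by one execution system.

    arrivals: list of (time, class_index, size), sorted by time.
    rates[c]: execution size per universal time unit for class c.
    Returns (latencies, queue_sizes), where queue_sizes[t] holds the total
    size of waiting demands per class at time t.
    """
    num_classes = len(rates)
    queues = [deque() for _ in range(num_classes)]
    latencies, queue_sizes = [], []
    current = None   # demand in execution: [arrival_time, class_index, remaining]
    next_queue = 0   # Round Robin pointer
    idx = 0
    for t in range(num_steps):
        # Scheduler: put newly arrived demands into the queue of their class.
        while idx < len(arrivals) and arrivals[idx][0] <= t:
            arr_t, cls, size = arrivals[idx]
            queues[cls].append([arr_t, cls, size])
            idx += 1
        queue_sizes.append([sum(d[2] for d in q) for q in queues])
        # Round Robin: take the next demand from the first non-empty queue.
        if current is None:
            for offset in range(num_classes):
                q = (next_queue + offset) % num_classes
                if queues[q]:
                    current = queues[q].popleft()
                    next_queue = (q + 1) % num_classes
                    break
        # Execution system: process the current demand for one time unit.
        if current is not None:
            current[2] -= rates[current[1]]
            if current[2] <= 0:
                latencies.append(t + 1 - current[0])
                current = None
    return latencies, queue_sizes

# Toy usage: two classes, two demands arriving at time 0.
lat, qs = simulate_execution([(0, 0, 2.5), (0, 1, 1.0)], [1.25, 1.0], 10)
print(lat, qs[0])
```

Averaging the recorded queue sizes and latencies over the T last observations would then give the inputs and outputs used for regression.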

4.2.3 Prediction and evaluation 

In order to evaluate Gaussian process regression and CART we have to determine a number of 
training points (inputs with outputs) and a number of test points. We use the following quality 
indexes: 

• Mean Absolute Error (MAE); 

• Mean Squared Error (MSE). 
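Both quality indexes are simple averages over the test points. A minimal sketch, with hypothetical function names and toy values:

```python
def mean_absolute_error(y_true, y_pred):
    """MAE: average of |y_n - prediction_n| over the test points."""
    return sum(abs(a - b) for a, b in zip(y_true, y_pred)) / len(y_true)

def mean_squared_error(y_true, y_pred):
    """MSE: average of (y_n - prediction_n)^2 over the test points."""
    return sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true)

# Toy usage with assumed values.
y_true = [1.0, 2.0, 3.0]
y_pred = [2.0, 2.0, 5.0]
print(mean_absolute_error(y_true, y_pred), mean_squared_error(y_true, y_pred))
```

MSE penalizes large errors more strongly than MAE, so reporting both gives a view of typical error as well as sensitivity to occasional large deviations.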

We allow three covariance functions in the simulation environment, i.e., the linear kernel (9), the squared
exponential kernel (10), and the complex kernel expressed as in equation (11). Moreover, we use a Gaussian
likelihood with variance $\beta^{-1}$ (called noise in the simulation environment) and the type 2 maximum
likelihood procedure for learning the hyperparameters.







Figure 3: GUI of the simulation environment. 
4.3 Results and discussion 

The GUI of the simulation environment (see Fig. 3) allows all parameters to be set. In order to conduct
the experiments, we generated 10 simulations of teletraffic with the following parameters:

• number of classes - 3; 

• number of test points - 1000; 

• number of training points - 1000; 

• p = 0.5; 

• T = 10; 

• $\beta^{-1}$ = 0.1;

• parameters of lognormal distribution - i) for class 1: 0.5, 0.25, ii) for class 2: 0.1, 0.5, iii)
for class 3: 0.75, 0.15;

• execution size per one universal time unit: i) for class 1: 1.25, ii) for class 2: 1.5, iii) for 
class 3: 1.1. 

The results of the 10 simulations are gathered and presented as box-and-whisker plots in Fig. 4 and Fig. 5
for MAE and MSE, respectively. Gaussian process regression performed better than
CART for every kernel function. However, the linear kernel appeared to be more suitable for representing
the similarity between inputs than the squared exponential or the complex kernel. The complex
kernel is a sum of the linear and squared exponential kernels, which is why it performed slightly better
than the squared exponential kernel alone.

In order to compare the results, we also performed a two-sample t-test for MAE at the 5% significance
level between Gaussian process regression with the linear kernel and CART. The null hypothesis, i.e., that the
random samples share the same mean and equal but unknown variances, can be rejected with a p-value
equal to $1.8406 \times 10^{-5}$. Similarly, for MSE, we can reject the null hypothesis with a p-value equal to
0.018. In other words, Gaussian process regression performs statistically significantly better than CART.
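The test described above, with a null hypothesis of equal means and equal but unknown variances, corresponds to a pooled-variance two-sample Student t-test. The sketch below computes only the test statistic and the degrees of freedom, since the Python standard library has no t-distribution. The function name and the sample values are hypothetical and are not the results of the experiment.

```python
import math
import statistics

def pooled_t_statistic(a, b):
    """Two-sample t statistic with pooled variance, and its degrees of freedom."""
    na, nb = len(a), len(b)
    va, vb = statistics.variance(a), statistics.variance(b)  # sample variances
    pooled = ((na - 1) * va + (nb - 1) * vb) / (na + nb - 2)
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(pooled * (1.0 / na + 1.0 / nb))
    return t, na + nb - 2

# Hypothetical error values for two methods over 10 runs each (illustration only).
method_a = [0.20, 0.21, 0.19, 0.22, 0.20, 0.18, 0.21, 0.20, 0.19, 0.22]
method_b = [0.26, 0.27, 0.25, 0.28, 0.26, 0.24, 0.27, 0.26, 0.25, 0.29]
t, df = pooled_t_statistic(method_a, method_b)

# With 18 degrees of freedom, the two-sided 5% critical value is about 2.101.
print(t, df, abs(t) > 2.101)
```

With 10 simulations per method, as in the experiment, the test has 18 degrees of freedom, and the null hypothesis is rejected at the 5% level when the absolute value of the statistic exceeds the critical value.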
Figure 4: Box-and-whisker plot for MAE results. Red line represents median value, blue box -
quantiles, and black lines - range of values.

Figure 5: Box-and-whisker plot for MSE results. Red line represents median value, blue box -
quantiles, black lines - range of values, and red cross - outlier.



5 Conclusions 



In this paper, we have presented Gaussian process regression as a predictive model for QoS
attributes in Web service systems. The idea of Gaussian process regression is well grounded in the
field of machine learning, but its application to Web service systems is novel. To evaluate
the performance of Gaussian process regression, a simulation environment was developed. Two
quality indexes were used, namely Mean Absolute Error and Mean Squared Error. The results show
that the Gaussian process performed best with the linear kernel and statistically significantly better than the
CART method.

The proposed approach shows that applying machine learning methods can improve existing
computer network systems. The results presented in the paper indicate high accuracy, but further
research and experiments, especially on existing systems, are necessary. Summing up, this paper
attempts to fill the gap between machine learning methods and computer network applications.